-
Notifications
You must be signed in to change notification settings - Fork 2.7k
JIT: fix filter liveness computation #23044
Conversation
When a filter is finished executing, control can logically pass to the associated handler, any enclosing handler or filter, or any finally or fault handler nested within the associated try. This is a consequence of two-pass EH. The jit was not propagating liveness from the nested handlers, which lead to a live object being collected inadvertently. This change updates `fgGetHandlerLiveVars` to find the nested handlers and merge their live-in into the filter block live sets. Because these implicit EH flow edges can create cycles in the liveness dataflow equations, the jit will also now always iterate liveness when it sees there is exception flow, to ensure livness reaches the appropriate fixed point. Added test case. Closes #22820.
@BruceForstall PTAL
Diffs on x86 on the test case and in a handful of places in corelib and fx.
Corelib text diffs seem to be an interleaving artifact. Looked at random sampling of the other diffs and all the methods with diffs had a filter with a nested try/finally. For example: Size regressions are from extra prolog inits. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. One thing that might be helpful is to add an example in the comments, to make it easier to understand the kind of EH structure that we're dealing with here.
Oops, forgot to have the new test case return 100 on success. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks Good
Actually I was wrong about no diffs on x64 (forgot to rebuild the new jit). Actual diffs are similar to those on x86:
File line diff count is high because GC info doesn't diff very well. Corelib file diff again appears to be an interleaving issue; something has changed recently that breaks jit-diff determinism, or maybe it's an issue with the |
Ubuntu test builds failing in msbuild with errors like:
@dotnet-bot retest Ubuntu x64 Checked Innerloop Build and Test OSX with its usual hangup issue. @dotnet-bot retest OSX10.12 x64 Checked Innerloop Build and Test Throwing in some GC stress legs... @dotnet-bot test Windows_NT x86 Checked gcstress0xc |
The change doesn't look right for non-x86 (tried on x64). It does what I was afraid of - it causes double reporting of the problematic variable during GC scan. As i've mentioned in https://github.com/dotnet/coreclr/issues/22820#issuecomment-469859758, the variable is not expected to be reported by the GC info as live in the filter, since when the filter runs, we have the frame that has thrown the exception on stack and so we report it during the stack walk twice - once from the filter frame and second time from the frame of the throw.
The stack walker for non windows-x86 has an extremely complicated logic full of assumptions about what GC info does report and what it doesn't in funclets and filters and their parent frames. So while we could possibly modify it to reflect this new way of reporting liveness and prevent the double reporting, I would like to avoid touching that code as much as I can. Bugs in this code usually result in very hard to reproduce and debug GC holes. So I would prefer modifying the change to be x86 windows only (not even x86 Unix as that one uses the same EH as x64, arm and arm64 everywhere, the x86 windows is an exception). @jkotas do you have any opinion here? |
Here is a GC stack walk log on x64 showing the double reporting (see
|
I've spent more time debugging on x86 and I have realized that I got mislead by the fact that the "k" windbg command was not showing the frame where the "throw" happened. It turns out the frame is still there and our stack walker still walks it. However, I am a bit torn here, as your current change that reports the variables in the filter on all platforms is actually better tracking the object lifetime reality. So it seems more "correct". Moreover, I've found that the double reporting on the other platforms doesn't cause trouble thanks to the fact that it is reported as "pinned" for the filter. It seems it just adds some overhead. |
fyi, there is a discussion of GC, filters, and pinning here: https://github.com/dotnet/coreclr/blob/master/Documentation/botr/clr-abi.md#filter-gc-semantics |
There was already an edge case for extending liveness reporting in filters in funclet EH models: Lines 1071 to 1082 in 5a26509
And there is this bit of compensating logic in the jit GC encoder: Lines 107 to 128 in 5a26509
So the pinning is evidently the thing that makes the double reporting work out. |
@BruceForstall thank you for the pointer to the doc. Then this change is definitely correcting the existing state according to the doc and should go in as is. |
@AndyAyersMS I think we should port this to 2.1 (and 2.2 unless changes from 2.1 flow automatically to 2.2). |
I was thinking along those lines too. Let me look into it. |
Port of dotnet#23044 to release/2.1. When a filter is finished executing, control can logically pass to the associated handler, any enclosing handler or filter, or any finally or fault handler nested within the associated try. This is a consequence of two-pass EH. The jit was not propagating liveness from the nested handlers, which lead to a live object being collected inadvertently. This change updates `fgGetHandlerLiveVars` to find the nested handlers and merge their live-in into the filter block live sets. Because these implicit EH flow edges can create cycles in the liveness dataflow equations, the jit will also now always iterate liveness when it sees there is exception flow, to ensure livness reaches the appropriate fixed point. Added test case. Closes #22820.
When a filter is finished executing, control can logically pass to the associated handler, any enclosing handler or filter, or any finally or fault handler nested within the associated try. This is a consequence of two-pass EH. The jit was not propagating liveness from the nested handlers, which lead to a live object being collected inadvertently. This change updates `fgGetHandlerLiveVars` to find the nested handlers and merge their live-in into the filter block live sets. Because these implicit EH flow edges can create cycles in the liveness dataflow equations, the jit will also now always iterate liveness when it sees there is exception flow, to ensure livness reaches the appropriate fixed point. Added test case. Closes dotnet/coreclr#22820. Commit migrated from dotnet/coreclr@f6cc013
When a filter is finished executing, control can logically pass to the
associated handler, any enclosing handler or filter, or any finally or fault
handler nested within the associated try. This is a consequence of two-pass EH.
The jit was not propagating liveness from the nested handlers, which lead to a
live object being collected inadvertently.
This change updates
fgGetHandlerLiveVars
to find the nested handlers andmerge their live-in into the filter block live sets.
Because these implicit EH flow edges can create cycles in the liveness dataflow
equations, the jit will also now always iterate liveness when it sees there is
exception flow, to ensure livness reaches the appropriate fixed point.
Added test case.
Closes #22820.